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Sampling Sparse Signals on the Sphere: 
Algorithms and Applications 
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Abstract —We propose a sampling scheme that can perfectly 
reconstruct a collection of spikes on the sphere from samples of 
their lowpass-filtered observations. Central to our algorithm is a 
generalization of the annihilating filter method, a tool widely used 
in array signal processing and finite-rate-of-innovation (FRI) 
sampling. The proposed algorithm can reconstruct K spikes from 
(K + y'K) 2 spatial samples. This sampling requirement improves 
over previously known FRI sampling schemes on the sphere by 
a factor of four for large K. 

We showcase the versatility of the proposed algorithm by 
applying it to three different problems: 1) sampling diffusion 
processes induced by localized sources on the sphere, 2) shot 
noise removal, and 3) sound source localization (SSL) by a 
spherical microphone array. In particular, we show how SSL 
can be reformulated as a spherical sparse sampling problem. 

Index Terms —Sphere, sparse sampling, diffusion sampling, 
sound source localization, annihilation filter, finite rate of in- 
novavtion, shot noise removal, spherical harmonics 

I. Introduction 

N UMEROUS signals live on a sphere. Take, for example, 
any signal defined on Earth’s surface 0-0- Signals 
from space measured on Earth @> 0 also have a spherical 
domain. In acoustics, spherical microphone arrays output a 
time-varying signal supported on a sphere |[6j, (7j, while in 
diffusion weighted magnetic resonance imaging fiber orienta¬ 
tions live on a sphere |8j. In practice, we only have access to a 
finite number of samples of such signals. Thus, sampling and 
reconstruction of spherical signals is an important problem. 

Just as signals in Euclidean domains can be expanded via 
sines and cosines, one can naturally represent spherical signals 
in the Fourier domain via spherical harmonics (9). A signal 
is bandlimited if it is a linear combination of finitely many 
spherical harmonics. Sampling bandlimited signals on the 
sphere has been studied extensively: for signals bandlimited to 
spherical harmonic degree L, Driscoll and Healy |9j proposed 
a sampling theorem that requires 4L 2 spherical samples. The 
best exact general purpose sampling theorem due to McEwen 
and Wiaux uses 2L 2 samples [ 101. Recently, Khalid, Kennedy 
and McEwen devised a stable sampling scheme that requires 
the optimal number of samples, L 2 Nil- 
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In this paper, we study the problem of sampling localized 
spikes on the sphere; in the limit, the spikes become Dirac 
delta functions. Such sparse signals on the sphere are encoun¬ 
tered in many problems. For example, various acoustic sources 
are well-approximated by point sources; the directional distri¬ 
bution of multiple sources is then a finite collection of spikes. 
Stars in the sky observed from Earth are angular spikes, and 
so are plume sources on Earth. 

Localized spikes are not bandlimited, so the bandlimited 
sampling theorems 0-® do not apply. In this paper, we 
propose an algorithm to perfectly reconstruct collections of 
spikes from their lowpass-filtered observations. Our algorithm 
efficiently reconstructs K spikes when the bandwidth of the 
lowpass filter is at least K + \[K. 


A. Prior Art 

Our work is in the same spirit as finite rate-of-innovation 
(FRI) sampling, introduced by Vetterli, Marziliano, and Blu 


1121. They showed that a stream of I\ Diracs on the line 
can be efficiently recovered from 2 K + 1 samples. Initially 
developed for ID signals, the original FRI sampling was 
extended to 2D and higher-dimensional signals in fl3| , 
and its performance was studied in noisy conditions [15), 


In a related work (T7), (T8), Deslauriers-Gauthier and 
Marziliano proposed an FRI sampling scheme for signals on 
the sphere, reconstructing K Diracs from 4A' 2 samples. Their 
motivating application is the recovery of the fiber orientations 
in diffusion weighted magnetic resonance imaging 0. 

They further show that if only 3AT spectral bins are active, the 
required number of samples can be reduced to 3A'. Sampling 
at this lower rate, however, relies on the assumption that 
we can apply arbitrary spectral filters to the signal before 
sampling. This is known as spatial anti-aliasing—a procedure 
that is generally challenging or impossible to implement in 
most applications involving spherical signals, where we only 
have access to finite samples of the underlying continuous 
signalsfU 

In many applications, the sampling kernels ( i.e ., the lowpass 
filters) through which we observe the spikes are provided by 
some underlying physical process ( e.g point spread functions 
and Green’s functions). These kernels are often approximately 
bandlimited, but we cannot further control or design the 
spectral selectivity of these kernels. This impossibility of 
arbitrary spatial filtering suggests that our goal is to reduce 
the required bandwidth, or more practically, to maximize the 
number of spikes that we can reconstruct at a given bandwidth. 


'This is not to be confused with spatial anti-aliasing in image downsam¬ 
pling, where we do have access to all pixels. 
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Recently, Bendory, Dekel and Feuer proposed a spherical 
super-resolution method | |20) , pTI, extending the results of 
Candes and Fernandez-Granda |22] | to the spherical domain. 
They showed that an ensemble of Diracs on the sphere can 
be reconstructed from projections onto a set of spherical 
harmonics by solving a semidefinite program, provided that the 
Diracs satisfy a minimal separation condition. When the Diracs 
are constrained to a discrete set of locations, their formulation 
allows them to bound the recovery error in the presence of 
noise. Our non-iterative (thus very fast) algorithm based on 
FRI does not require any separation between the Diracs. We 
also allow the weights to be complex, which may be important 
in applications (for an example on sound source localization, 
see Section |IV-C| l. On the other hand, we need to assume 
that the number of Diracs is known a priori (or that it can be 
estimated through other means), whereas in m no such 
assumption is necessary. 


B. Outline and Main Contributions 

We start by reviewing some basic notions of harmonic 
analysis on the sphere in Section III] We then present the main 
result of this work in Section Hill : A collection of K Diracs 
on the sphere can be reconstructed from its lowpass filtered 
version, provided that the bandwidth of the sampling kernel 
is at least K + yfK. This bandwidth requirement also implies 
that (K + \/K)' 2 spatial samples taken at generic locations 
suffice to reconstruct the K Diracs. We establish this result 
by constructing a new algorithm for spherical FRI sampling. 
Compared to \K 2 samples as required in a previous work 
ED’ our algorithm reduces the numbers of samples via a 
more efficient use of the available spectrum. For large K, 
the required number of samples is reduced by a factor of 
up to 4. The proposed algorithm is first developed for the 
noiseless case. Procedures to improve the robustness of the 


algorithm in noisy situations are presented in Section III-E 


and we compare the performance of the algorithm with the 
Cramer-Rao lower bound |23) in Section |III-F| Section m 
presents the applications of the proposed algorithm to three 
problems: 1) sampling diffusion processes on the sphere, 2) 
shot noise removal, and 3) sound source localization. These 
diverse applications demonstrate the usefulness and versatility 
of our results. We conclude in Section m 

This paper follows the philosophy of reproducible research. 
All the results and examples presented in the paper can be 
reproduced using the code available at http://lcav.epfl.ch/ivan. 
dokmanic 


II. Harmonic Analysis on the Sphere and 
Problem Formulation 

A. Spherical Harmonics 

We briefly recall the definitions of spherical harmonics and 
spherical convolution. The 2-sphere is defined as the locus of 
points in R 3 with unit norm, 

§ 2 = f {x G R 3 | x T X = 1} . 

In what follows, we often use £ to represent a generic 
point on the sphere. In addition to the standard Euclidean 


representation £ = [at, y, z] T , points on S 2 can also 
be conveniently parameterized by angles of colatitude and 
azimuth, i. e. , £ = (9 , </>), with 9 measured from the positive z- 
axis, and (f> measured in the xy plane from the positive x-axis. 
The two equivalent representations are related by the following 
conversion, 

x = sin(0) cos((/>), 

y = sin(0) sin(^), (1) 

z = cos {9). 

The Hilbert space of square-integrable functions on the 
sphere, L 2 (S 2 ), is defined through the corresponding inner 
product. For two functions f,g G L 2 ( S 2 ) we have 

</, 9) = f mW) d£, (2) 

Js 2 

where d£ = sin(0) dfl d(j> is the usual rotationally invariant 
measure on the sphere. With respect to this inner product, 
spherical harmonics form a natural orthonormal Fourier basis 
for L 2 (§ 2 ). They are defined as j9j 

Y e m (0, <j>) = N^p\ m] (cos (3) 

where the normalization constant is 

JV7* = (_i)(™+M)/2 / (21+ 1 ) (*~ M)! (4) 

* 1 j Y 4?r (* + M) ! ’ 

and P™ (x) is the associated Legendre polynomial of degree £ 
and order m. Note that different communities sometimes use 
different normalizations and sign conventions in the definitions 
of spherical harmonics and associated Legendre polynomials. 
As long as applied consistently, the choice of convention does 
not affect our results^ 

In this paper, we adopt the following definition 

d m 

PT{x) = - x 2 ) m ' 2 -^P e (x), for m > 0, (5) 

where Pe(x) is the Legendre polynomial of degree £ [241. 

Any square integrable function on the sphere, / e L^l S 2 ), 
can be expanded in the spherical harmonic basis, 

oo 

fM) = Z £ fT Y e m (0,<t>). (6) 

e=0 |m|^ 

The Fourier coefficients are computed as 

It = </, yn = f mw® ^ 

JS 2 

The coefficients [/™, (£, m) G l] form a countable set 
supported on an infinite triangle of indices, 

1= {(l,m)GZ 2 | \m\ < £} . (8) 

We say that / is bandlimited with bandwidth L if fp = 0 for 
£ ^ L. Often we think of L as the smallest integer such that 

2 It is common to write the spherical harmonic order m in the superscript. 
We will keep this convention for the associated Legendre polynomials Pj m L 
spherical harmonics Y J n , normalization constants N™ and the spherical 
Fourier coefficients /™. It is not to be confused with integer powers such 
as . 
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this holds. For a bandlimited function, the triangle X is cut off 
at l = L. In what follows, we use 

X L = f {(£, to) g Z 2 I 0 < e < L, |to| ss £} (9) 


to represent the spectral support of a bandlimited function 
with bandwidth L. The set 1/ contains L 2 indices, so we can 
represent the spectrum as an L 2 -dimensional column vector 


/ = 


f o 
Jo 


fr 


A 0 , A 1 , 


-L+1 


J L -1 


h- 




( 10 ) 


B. Rotations and Convolutions on the Sphere 

Let S0 3 denote the group of rotations in Er'; any rotation 
g G S0 3 is parameterized by three angles that specify rotations 
about three distinct axes. Thus we can write g = g(a,/3,'y)- 
The commonest parameterization is called Euler angles |25] , 
Counter-clockwise rotation of a vector x G R 3 about the 
z-axis is achieved by multiplying x by the corresponding 
rotation matrix. 


Rz(ot) 


cos a — sin a 0 
sin a cos a 0 

0 0 1 


where a is the rotation angle. Rotation matrices around axes 
x and y can be defined analogously. 

We use A(g) to represent the rotation operator correspond¬ 
ing to q, that acts on spherical functions. Thus for / a function 
on the sphere, A (g)f represents the rotated function, defined 
as 

[A(g)fm = fiQ^oO, (11) 


where p _1 is the inverse rotation of p, and by g -1 o £ 
we mean pre-multiplying by R(g~ x ) the unit column vector 
corresponding to £, cf. Q. Compare this definition with 
the Euclidean case where shifting the argument to the left 
(subtracting a positive number) results in the shift of the 
function to the right. 

There are various definitions of convolution on the sphere, 
all being non-commutative. One function, call it /, provides 
the weighting for the rotations of the other function h. A 
standard definition is then HD 


[f*h] (0 = 


J- f dg-f(go V ) ■ A(g)) h 

J so 3 / 

rf /(e oj ?)M ? _1 °0 d e. 

j7r J SOo 


(0 


( 12 ) 


where 77 G S 2 is the north pole. It is easy to verify that this 
definition generalizes the standard convolution in Euclidean 
spaces, with the rotation operator g playing the same role as 
translations do on the line. Because the spherical convolution 
is not commutative, it is important to fix the ordering of the 
arguments. In our case, the second argument— h in 0 — will 
always be the filter, i.e., the observation kernel. 

The familiar convolution-multiplication rule in standard 
Euclidean domains holds for spherical convolutions too. It 
can be shown |9] Theorem 1] that for any two functions 


f,h G L 2 (S 2 ), the Fourier transform of their convolution is 
a pointwise product of the transforms, i.e.. 


(/ - h)T 


4-7T 

2 £+ 1 


ft 


hi 


(13) 


We note that / can also be a generalized function (a 
distribution). In particular, we consider spherical Dirac delta 
functions, defined as HD 


0 ) = 


5(9 - e 0 )s{</> - M 

sin($) 


(14) 


and weighted sums of Dirac deltas. To lighten the notation, we 
often write <$(£;£o)- With the definition in ( [T4| , it is ensured 
that 

<5(4; € 0 ) d« - 1, V&eS’. (15) 


f 

J s 2 


C. Problem Formulation 

Consider a collection of K Diracs on the sphere 

K 

/(0 = £“**(£;&)> as) 

k= 1 

where the weights {au £ C}^ x and the locations of the Diracs 
{£k = (9k, 4>k))k=i are a H unknown parameters. Let y(£) be 
a filtered version of /(^), i.e., 

y(0 = [/* h](0, 

where the hlter (or sampling kernel) h(£) is a bandlimited 
function with bandwidth L. We further assume that the spher¬ 
ical Fourier transform of h(£) is nonzero within its spectral 
support, i.e., h™ A 0 for all £ < L. Given spatial samples 
of y(£), we would like to reconstruct /(£), or equivalently, to 
recover the unknown parameters {(Cfc, ^ 3 . 

Since the filtered signal y(£) is bandlimited, we can use 
bandlimited sampling theorems on the sphere ( e.g., ©) 

or direct linear inversion (see Section |III-A| ) to recover its 
Fourier spectrum yj n from its spatial samples of sufficient 
density. Using the convolution-multiplication identity in ( p~3] ), 
we can then recover the lowpass subband of /(£) as 

fr = \(2e+l)/(An)f 2 -{yr/h^, 

for 0 ^ i < L and \m\ ^ £. Being a collection of Diracs, 
/ ^ L 2 (§ 2 ), but its Fourier transform /J 71 can still be computed 
via 0 in the sense of distributions as 


K 

fr = 2 otkYfAhM 

fc =t ^ (17) 

= NT 2 a k P l e ml (cos 9k)e-^ k . 
k= 1 

The problems we address in this paper can now be stated 
as follows: Can we reconstruct a collection of K Diracs on 
the sphere from its Fourier coefficients in the lowpass 
subband Xl as defined in (0? If so, then what is the minimum 
bandwidth L that allows us to do so? In practice, the sampling 
kernel is often given and not subject to our control. In this case, 
the previous question can be reformulated as determining the 
maximum number of spikes that we can reconstruct at a given 
bandwidth L. 
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III. Sampling Spherical FRI Signals 

In this section we address the questions stated above. Our 
main result can be summarized in the following theorem: 

Theorem 1 . Let f be a collection of K Diracs on the 
sphere S 2 , with complex weights {a k } k=1 at locations 
{ 6 c = (9 k ,4> k )} k=1 , as in Convolve f with a bandlimited 
sampling kernel Hl, where the bandwidth L R K + \J~K, 
and sample the resulting signal [f * h k ] (f) at L 2 points 
L 2 

{f) n e § 2 } n _ 1 chosen uniformly at random on § 2 . Then almost 
surely the samples 

fn = [f * h L ](lp n ), n = 1, . . . , L 2 

are a sufficient characterization of f. 

We provide a constructive proof of this theorem by present¬ 
ing an algorithm that can efficiently recover K localized spikes 
from L 2 samples, where L R K + \/K. Before presenting the 
algorithm and the proof, we first define some relevant notation 
and state two lemmas. 

A. From Samples to the Fourier Transform 

Our algorithms perform computation with spectral coeffi¬ 
cients. In practice, we have access to spatial samples of the 
function, so we need a procedure to convert between the 
spatial and the Fourier representations. We first describe a 
method to compute the Fourier transform from samples taken 
at generically placed sampling points. 

Let the function / e if (if 2 ) have bandwidth L\ then we 
can express it as 

L —1 l 

/(M)= £ £ fTYPiW)- ( 18 ) 

£=0 m=—i 

Choose a set of sampling points e S 2 J N _ 1 , and let Y = 
[y n ,(Lm)] where y n ,u,m) = Y P{Tn)- Furthermore, let / = 
\f(if ’i), ..., f(ip n )] T be the vector of samples of /. We can 
then write 

f = Yf, (19) 

where f is the L 2 -dimensional vector of spectral coefficients 
as defined in ( fl0| ). The goal is to recover the spectral coeffi¬ 
cients /. We can recover / from f as soon as the matrix Y 
has full column rank. In that case, we compute 

f = yif, ( 20 ) 

where Y ' denotes the Moore-Penrose pseudoinverse of the 

matrix Y. 

In particular, if we draw the samples uniformly at random 
on the sphere, we can show that Y is regular with probability 
one: 

Proposition 1 . Draw N sampling points from any absolutely 
continuous probability measure on the sphere (e.g. uniformly 
at random). Then Y has full column rank almost surely if 
N R L 2 , that is, if it has at least as many rows as columns. 

The proof of this proposition is identical to that of Theorem 
3.2 in j28), and is thus omitted. 


The above result indicates that we can recover the spectral 
coefficients f™ in the lowpass region T k from L 2 samples 
taken at generic points on the sphere. The reconstruction 
requires a matrix inversion as in ( | 20 | ). 

Much faster reconstruction is possible when the function is 
sampled on certain regular grids. In that case, we can leverage 
the structure of Y to accelerate the matrix inversion. Such 
efficient schemes were proposed by Driscoll and Healy |9}, 
requiring 41 2 samples; by McEwen and Wiaux [ 101, requiring 
2L 2 samples; and most recently, by Khalid, Kennedy and 
McEwen I"]- requiring L 2 samples. 


B. The Data Matrix 

Using the definition of associated Legendre polynomials in 
0 . we rewrite the spherical harmonics ([3} as 


Y e m (9 ,0) = iVT 1 (sin 0) H 


dM 


d(cos 0 )l m l 


Pt (cos 9) 




( 21 ) 


where Nff = (-1 ) m N™. 

The essential observation is that the bracketed term in ( | 2 Tj ) 
is a polynomial in x = cos 6. At bandwidth L, the largest 
spherical harmonic degree is L — 1, so the largest power of 
x in ( | 2 T] i is L 1 as well. It follows that we can rewrite the 
derivative term as a linear combination of powers of x, i.e. 


Nf 


d\ r 


d(cos8)\ r 


r Pf(cos(9) = cj m x , 


( 22 ) 


where x = f [x i_1 , x L ~ 2 , • • • , x, 1] T , x = cos 9 and C( rn e 

R L contains the corresponding polynomial coefficients. 

Using the dot-product formulation ( f22] >, the spectmm of /, 
as given by G3- can be expressed as 

K 

It = cJm £ oc k x k (sin (23) 

k= 1 

where x k = f [x£ _1 , x£ -2 , , x k , 1] T with x k = cos 9 k , 

and we factored cj m out of the summation as it does not 
depend on k. 

A key ingredient in our proposed algorithm is what we call 
the data matrix A, formed as a product of three matrices, 


A = XAU , 


(24) 


where 

X = [ Xl , ■■■ , x K ]eR LxK , (25) 

is a Vandermonde matrix with roots cos 9 k , A = 
diag(ai,..., ap is the diagonal matrix of Dirac magnitudes, 
and we define 


U = [u km \ e K ifx ( 2i_1 ) ; (26) 

with u km = (sin 6 » fc )l m le- jm ^. 

It is convenient to keep a non-standard indexing scheme for 
the rows and columns of A, as illustrated in Fig. Rows of 
A, indexed by p, correspond to decreasing powers of cos 9 k , 
from p = L — 1 at the top, to p = 0 at the bottom; columns 
correspond to u km , with m increasing from —L + 1 on the 






DOKMANIC AND LU: SAMPLING SPARSE SIGNALS ON THE SPHERE 


5 


Spherical harmonic spectrum 


order m 


0 

II 





































































CD 



















U) 














































































































L 







i 


L 


1 














J 


Invertible linear mapping 



Fig. 1. Illustration of Algorithm [7] Spherical harmonic spectrum (A) is 
linearly mapped onto the shaded triangular part of the data matrix A 
(B). Columns of the data matrix are indexed from left to right by ra, 
— (L — 1) ^ m ^ (L — 1), corresponding to spherical harmonic order. 
Rows are indexed from bottom to top by p, 0 ^ p ^ (L — 1) corresponding 
to powers of cos 6. Note that the triangular part of the data matrix does not 
coincide with the spherical harmonic spectrum, although there is a one-to- 
one linear mapping between the two (see LemmafI). Existing results on 2D 
harmonic retrieval can exploit only a small part of the data matrix, for example 
the hatched square (see Section |IlI-D| Finally, sufficiently long columns of A 
are rearranged in the block-Hankel-structured annihilation matrix Z, whose 
nullspace contains exactly the sought annihilation filter, h (C). 


left, to L — 1 on the right. We see from ( |23[ i and ( |24| ) that 
computing any spectral coefficient ff 1 amounts to applying a 
linear functional on A as follows 

ST = C Jm Ae m = ( c <m e w A> p , (27) 

where e m e R 2L-1 is the vector with one in position m for 
—L < m < L , and zeros elsewhere, and ( • , • ) F denotes 
the standard inner product between two matrices, defined as 

<A B) F = Y (J "'At = trac e(A H B). 


The last expresion in \21\ implies that the spectral coef¬ 
ficient f™ can be obtained as an inner product between the 
data matrix A and a mask Cf rn eJ n that is overlaid over A. 
One can verify that the support of this mask for f™ is on the 
column corresponding to m, and on the rows corresponding 
to 0 < p < L — |m|. That means that certain parts of the 
data matrix are not involved in the creation of any spectral 
coefficient; consequently, they cannot be recovered from the 
spectrum. Nevertheless, we can recover a large part: 

Lemma 1. There is a one-to-one linear mapping between 
the spherical harmonic coefficients in the lowpass subband, 
\ff n , (l, to) e 1l\, and the triangular part of the data 
matrix A indexed by Jl = {(p,m) j 0 ^ \m\ < p < L) (with 
indexing as illustrated in Fig. |T|). 

Proof. It is straightforward to verify that all the masks Cf rn eJ n 
for 0 ^ |m| ^ £ < L are supported on the triangular part of 
A, as indexed by Jl. Because the number of such masks 
coincides with the number of entries in the triangular part, 
and no mask is identically zero, it only remains to show that 
the masks are linearly independent. For mi A m 2 , this is 
true because their supports are disjoint ( ej ni and e ^ 2 activate 
different columns), 

su pp(c£ 1 m 1 e m 1 ) n supp(c<? 2 m 2 e^ 2 ) = 0 (28) 

for any For i\ < f 2 and mi = m 2 = m, cj im x and 

cj 2 . m x are polynomials of different degrees (c.f (| 22 [i), 

deg(c[ im a;) = (h - |m|) < (h - M) = deg(cj 2 m a:, (29) 

where deg( • ) denotes the degree of the polynomial in 
the argument. Therefore, supp(c^ im e^) / supp(Q 2 TO 2 e^), 
and in particular C( 2 rn is linearly independent from all Q m 
such that f < £ 2 - This implies that all masks are linearly 
independent. Thus the mapping 

A [ (c ern ej n , A) f , 0 ^ \m\ < l < L] 

= [fT,0< M ^(<L] (30) 

is one-to-one on Jl. O 

C. Reconstruction by Generalized Annihilating Filtering 
Element of the data matrix A at the position ( p , m) (with 
reference to Fig. [TJi) can be expanded as 

K 

dpm = ^ a fc z£(sin0 fc ) W e- jm ^, (31) 

fc=i 

where p varies from 0 to L — 1, and m from —(L— 1) to 
(L — 1). For either positive or negative m, the sum ( |3T| is a 
sum of 2D exponentials. Lemma[T]implies that we can recover 
the shaded triangular part of the data matrix in Fig. [I] from 
the spectrum. In what follows, we propose a new algorithm to 
recover the parameters of the Diracs from that triangular part. 

def 

The vector = Ae m is a linear combination of columns 
of X , i.e., it is a linear combination of K exponentials with 
bases Xk, 

K 

dpm = ( 32 ) 

k =1 
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where Xk = cos {Ok). Similarly to standard Euclidean FRI 
sampling ( 12 ), we can use the annihilating filter technique 
to estimate the roots {xk = cos 0 fc}t=i °f these exponentials. 

Annihilating filter is a finite impulse response (FIR) filter 
with zeros positioned so that it annihilates signals of the form 
( |32| >. Consider an FIR filter II (z) with the transfer function 

H{z) f\( 1 - XkZ- 1 ) ^ £ h n z~ n , (33) 

k =1 n=0 


where h = [ho, hi, h^] T is the vector of filter 

coefficients. It holds that h*d m = 0 (see Appendix [A]) for any 
to, provided that d m is of length at least K + 1. Equivalently, 


[dn,m, dn — l^m, ■ ■ • ■ d n —x,m\ 


ho 

hi 

h K 


= 0 , 


(34) 


for n K. In our scenario, we do not know the bases of the 
exponentials {xk)k=i —they are exactly the parameters we aim 
to estimate. Thus we do not know the filter H(z) either. 

Up to a scaling factor, there is a unique (K + 1)- 
tap filter H(z) with the sought property. The orthogonal¬ 
ity relation a says that h lives in the nullspace of 
[d n ,m d n - i jm • • • d n -K,m\', we need at least K such vectors 
to make their joint nullspace one-dimensional, thus to pinpoint 
h. Once the filter coefficients are found, we can obtain the 
unknown parameters [xk] by root finding and using the 
factorization in ( [33] ). 

For the annihilating filter technique to be applicable, we 
need to ensure that all the colatitude angles 9k are dis¬ 
tinct. Furthermore, the form of our equations reveals that for 
9k e {0, 7 r}, Ukm = 0 for all to. In the parameterization 
<®, this is equivalent to setting oik = 0 , and it prevents 
us from recovering the corresponding Dirac. This behavior is 
undesirable, but we can guarantee that no Dirac sits on a pole 
by first applying a random rotation. This fact is formalized 
in the following lemma, which follows immediately from the 
absolute continuity of the Haar measure. 

Lemma 2. Consider a collection of Dirac delta functions on 
the sphere, f(f) = 2 fc=i £&), and a random rotation 

q drawn from the Haar measure on S0 3 (i.e. uniformly over 
the elements of the group). Then with probability 1, A(p)/ 
contains Diracs with distinct colatitude angles, 9 j A 9j for 
i A j, and no Dirac is on the pole, 9k $ {0, 7 r} for all k. 

We are now well-equipped to prove the main result. 


Proof of Theorem [7] We provide a constructive proof, sum¬ 
marized in Algorithm [l] First observe that L 2 random samples 
almost surely suffice to compute the spectral coefficients f™ 
in the lowpass subband T/ t with bandwidth L, as detailed in 
Section III-A (see Proposition [TJ. By Lemma [T] we can then 
compute the shaded part of A given the spectrum /. 

Our aim is to construct the annihilating matrix Z , structured 


as follows 


dL- 1,0 

dL- 2,0 

, a - 

1 

* 

1 

o 

dL- 2,0 

dL- 3,0 • • 

d>L-K- 2,0 

dap 

dx- 1,0 ’' 

^ 0,0 

dL- 2,1 

dL- 3,1 

a - 

t** 

1 

1 

to 

dL- 3,1 

dL- 4,1 

dL-K- 3,1 


( 35 ) 


Z is constructed by stacking segments of length (K + 1) 
extracted from the columns of A. From the annihilation 
property ([34]), it follows that the nullspace of Z contains the 
sought annihilating filter. 

The trick now is to count how many such segments we can 
get from the shaded part of A. For to = 0, p varies from 0 to 
L— 1. Therefore, we can construct L — K rows of the matrix 
Z. For to = 1, p varies from 0 to L — 2, so we can construct 
L — K — 1 rows of Z , and the same goes for m = — 1. This 
process is illustrated in Figs. 0 * and dp. Summing up, we get 
the total number of rows of Z that we can construct from the 
available spectrum. 


# = {L — K) + 2 x (L - K - 1) + • • • + 2 x 1 

= (L — K) 2 . 


(36) 


Z needs at least K rows, as we need a ID nullspace. Thus 


(L - K) 2 ^ I< 
=> K + VK. 


(37) 


In Appendix [C] we show that Z has rank K as soon as it has 
I\ or more rows. In other words, it has a one-dimensional 
nullspace, and thus the annihilating filter coefficients are 
uniquely determined, up to a scaling factor. 

We find the parameters [9k]^ =1 by taking the arc cosine of 
the roots of H(z). This procedure is well-posed because arc 
cosine is one-to-one on [0,7r]. To ensure that the roots are 
distinct, we apply a random rotation before the estimation, 
and the inverse of this random rotation after recovering all the 
parameters of the Diracs (invoking Lemma [2|. 

In order to recover the azimuths {<f>k}^ = i, note that after 
recovering the colatitudes, we can construct the matrix X, 
and compute AUe m for \m\ < L — K. The azimuths are then 
given as the phase difference between AUe o and AUe \. The 
magnitudes ctk are obtained simply as AUe q. □ 


I). Sampling Efficiency and Relation to Prior Work 

Our proposed sampling scheme and the spherical FRI sam¬ 
pling theorem by Deslauriers-Gauthier and Marziliano G3 are 
both naturally expressed in terms of the bandwidth L of the 
sampling kernel required to recover K Diracs. In our case, 
the bandwidth requirement is that it be at least K + \/K. 
This implies that we need at least (K + \/K) 1 spatial samples 
in order to recover the K Diracs. For comparison, the FRI 
sampling theorem of Deslauriers-Gauthier and Marziliano G3 
requires L ^ 2 K, and thus their algorithm recovers K Diracs 
given 4 K 2 samples. This is asymptotically four times the 
number of samples required by Algorithm |T| 
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Algorithm 1 Spherical Sparse Sampling 
Input: Spatial samples of / e L 2 (§ 2 ) with bandwidth L, 
number of Diracs I\ 

Output: Colatitudes, azimuths and magnitudes 
{(a k ,9k,<t>k)}k=i of the K Diracs 
1 : Sample a random rotation g ~ Haar(S0 3 ) 

2 : Apply g to /, / <— A (g)f (relabel sampling points) 

3: Compute the spectrum / from the rotated samples of / 

4: Form A from / using the inverse mapping of ( |30l > 

5: Form Z from A according to ( [35] ) 

6 : h <— Right singular vector of Z for smallest sing. val. 

7: Compute the colatitudes, (Ok)Jf =1 <— arccos[Roots(h)] 

S: Construct X from x k = cosd k according to ( [25] ) 

9: Using X in ( |24] i, compute AUe 0 and AUei 
10 : ( <j)k)k=i Angle[(A[/e 0 ) 0 (AUe i)] o See the note 

ll: {uk)k=i •*- AUe 0 

12 : Apply the inverse of g, = (@k, 4>k) <— P _1 ° £fc, Vfc 
[> Note: we use the symbol 0 to denote element-wise division 
of vectors. 


The difference in sampling efficiency can be explained by 
spectrum usage. Fig. [2] illustrates the portion of the spectrum 
used by the two algorithms. We can see that the proposed 
algorithm is more efficient in that it uses a larger portion of 
the available spectmm to reconstruct the Diracs. 

Similar problems have been considered in the literature 
on 2D harmonic retrieval [291. However, these earlier works 
assume that the entire data matrix is known. In our case, A 
is known only partially, as illustrated in Fig. Of 5 - To apply 
the existing results on 2D harmonic retrieval, we could use a 
square portion that falls strictly inside a half of the triangle, 
either for to ^ 0 or for to ^ 0. However, we can see in 
Fig. GP that this is an inefficient use of available spectrum, 
and it requires an unnecessarily high sampling density. 

As mentioned earlier, in most situations we do not get to 
choose L as it is fixed by the underlying physical process. 
Then the question is how many Diracs we can reconstruct 
given a kernel with a fixed bandwidth L. By solving L > 
K + \[K for K, we get that 

K s? L- (L+ i ) 1 / 2 + (38) 

In contrast, the algorithm in jT7] can reconstruct up to K = 
L/ 2 Diracs. 


E. Denoising Strategies 

Theorem [T] and Algorithm [T] provide a tool to recover sparse 
signals on the sphere in the noiseless case. We may apply 
several procedures to improve the robustness of the algorithm 
in the presence of noise. 

In general, if the samples are noisy then the annihilating 
matrix Z in © will not have a nontrivial nullspace. A 
simple and robust approach is to use the right singular vector 
corresponding to the smallest singular value of Z as the 
annihilation filter. Let Z = UHV H be the SVD of Z\ then 
we set h = vk+i- 


■ 


Proposed 



Fig. 2. Spectrum usage for different algorithms. Spectral coefficients used 
by our algorithms are shown hatched. Spectmm used by the algorithm of 
Deslauriers-Gauthier and Marziliano GI) is shaded green. In the example, 
the bandwidth is set to L — 12, so the maximum number of Diracs that 
can be recovered by Algorithm [T] is K = 9. The algorithm in |T7| recovers 
K = 6 Diracs. 


To further improve the algorithm performance, we can use 
the output of Algorithm [I] to initialize a local search for the 
minimizer of the £ 2 error between the spectrum generated by 
the estimated Diracs, and the measured spectrum. 


minimize 

(ak,6k,<l>k)£ =1 


Z E 


k =i 


(39) 


We note that directly solving ( [39] ) with a random starting point 
is hopeless due to a multitude of local minima. 


F. Cramer-Rao Lower Bound 

We evaluate the Cramer-Rao lower bound (CRLB) for the 
estimation problem. For simplicity we treat the K = 1 case, 
so that the minimal bandwidth is L = 2, and £ e {0,1}. We 
assume that the spatial samples are taken on the sampling grid 
defined by McEwen-Wiaux & given at this bandwidth as 


6 


7t/3 

7t/3 

=1 

co 

=1 

=1 

=1 

4> 


0 

0 

2tt/3 2tt/3 4tt/3 4tt/3 


Resulting expressions for elements of the Fisher information 
matrix are complicated, and there is no need to exhibit them 
explicitly. We give the details of the computation in Appendix 
|B| and we compute the CRLB numerically. The resulting 
bound is plotted in Fig. [3] for two different spike colatitudes, 
together with the MSE achieved by Algorithm [I] followed by 
the descent m As pointed out before, because our scheme 
is coordinate-system-dependent, the bound depends on the 
colatitude of the Dirac. 

IV. Applications 

To showcase the versatility of the proposed algorithm, 
we present three stylized applications: 1 ) sampling diffusion 
processes on the sphere, 2 ) 

shot noise removal, and 3) sound source localization with 
spherical microphone arrays. 























































Fig. 3. Comparison between the mean squared error (MSE) of the proposed 
algorithm in estimating the spherical location (#,</>), with K = 1, and the 
Cramer-Rao lower bound (CRLB), at two different colatitudes. Note that the 
bound is different for different colatitudes of the spike, due to parameterization 
dependence. MSE is shown for the output of Algorithm |TJ followed by the 
minimization of (39) using Matlab’s f mins ear ch function. 



Fig. 4. Estimating the release locations and magnitudes of diffusive sources 
on the sphere. We assume that the diffusive sources appear at time t = Os, 
and that we sample the field at time t = Is. Shape of the diffusion kernel 
as a function of 0 is shown in subfigure A for three different values of the 
coefficient k (in units of inverse time). The logarithm of the aliasing error ED 
is plotted as a function of the cutoff degree L in subfigure B. Subfigures C and 
D show a typical reconstruction result for k = 0.1 (2 sources) and k = 0.01 
(3 sources). Magnitudes of the sources are represented by the distance of 
the corresponding symbols from the sphere’s center. Blue diamonds represent 
true source locations and magnitudes, while red circles represent estimated 
source locations and magnitudes. The sphere color corresponds to the value 
of the function induced on the sphere by the sources (red is large, blue is 
small). Signal-to-noise ratio in both C and D was set to 30 dB. We used the 
approximate bandwidth of L = 7, so that the number of samples taken in 
either case was 49. 


A. Sampling Diffusion Processes on the Sphere 

The diffusion process models many natural phenomena. 
Examples include heat diffusion and plume spreading from 
a smokestack. Often, the source of the diffusion process is 
localized in space, and instantaneous in time. Sampling such 
processes in Euclidean domains has been well studied p0]|- 
(32). 


Diffusion processes on the sphere are governed by the 
equation (33) 


ps 

kAv(£,t) = — v(£,t), (41) 

where A is the Laplace-Beltrami operator on 8 2 , and k is 
the diffusion constant. In the spherical harmonic domain, this 
becomes 

(42) 

giving the solution 

v?(t) = e~^ +1)kt vT (0), (43) 


where v™ (0) is the spectrum of the initial distribution. There¬ 
fore, we interpret the term e~ e ^ +1 ^ kt S m fi as the spectrum of 
the Green’s function of the spherical diffusion equation. In 
other words, it is the spectrum of the diffusion kernel on the 
sphere. Then (43) should be interpreted as the convolution 
between the kernel and the initial distribution. 

We consider the case when the diffusion process is initiated 
by K sources localized in space and time, i.e., the initial 
distribution in (43) is 

K 

v(£;o) = £ (44) 

k= 1 


We show how to use the proposed sampling algorithm to 
estimate the locations and the strengths of the sources from 
spatial samples of the diffusion field taken at a later time to- 
Recovering all parameters (locations, amplitudes and release 
times) of multiple diffusion sources is a challenging task 
1301. To focus on the proposed sampling result, we make 
the simplifying assumption that the K sources are released 
simultaneously, and at a known time (t = 0). In principle, 
the more challenging case of unknown and different release 
times can be handled by adapting the techniques derived in 
(3T), gg, but these generalizations are out of the scope of 
this work. 

In the spatial domain, the diffusion kernel at time to after 
the release is given as 


oo 

h di{ (£; 1 0 ) = e-^ + V kt °Y e °(Z). (45) 

e=o 

Combining (45} with (43) and the spherical convolution- 
multiplication rule m we get 


v(£; to) = v{(,] 0) * h dii (£; t 0 ) 

K 

= E c k[Mek)h d if( -;io)](0- 


This signal is a sum of rotations of a known template. The 
diffusion kernel in © is not exactly bandlimited, but it 
is approximately so. We can therefore apply the spherical 
FRI theory and Algorithm [I] to recover the locations and the 
magnitudes of the diffusive sources. 

Fig. [4}A shows the shape of the symmetric diffusion kernel 
as a function of the colatitude 9. The high degree of smooth¬ 
ness is reflected in an approximately bandlimited spectrum. 
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This is demonstrated in Fig.[4]B, where we see that the aliasing 
energy due to spectral truncation, defined as 


e(L) 



00 


z 



2 £ + 1 ’ 


(47) 


rapidly becomes negligible as we increase the cutoff band¬ 
width L. Figs, [dp and [4j.) demonstrate accurate reconstruction 
of the localized diffusion sources at two different values of the 
diffusion coefficient (the detailed parameters of the numerical 
experiment are given in the figure caption). 


B. Shot Noise Removal 


Suppose that we sample a bandlimited function on the 
sphere, but a small number of samples are corrupted—they 
contain shot noise —due to sensor malfunction. Moreover, the 
identities of the malfunctioning sensors are not known a priori. 
Can we detect and correct these anomalous measurements? We 
show that our sampling results can be applied to solve this 
problem, provided that the number of erroneous sensors is not 
too large and that the original sampling grid is oversampling 
the bandlimited function. A similar idea was used in (34) to 
remove shot noise in the ID Euclidean case. 

For this application we assume that the samples 
are taken on a uniform grid on the sphere, 
{(0 p , 4> q ) | p,q e Z,0 ^ p < 2 L', 0 ^ q < 2 L'}. debited 

by 

7J7T / nr 

(48) 


0 P = 


pit 


2 U 


qtt 

U' 


Imagine now that we sample / on this sampling grid. Some 
samples are corrupted, so we measure g(6 p , cj> q ) = f(0 p , <f) q ) + 
s pq , where 

nonzero (p, q) e S 

S pq = i ’ . (49) 

zero otherwise, 


and S holds the indices of the corrupted samples. We will 
leverage an elegant quadrature rule by Driscoll and Healy (9|: 

Theorem 2. /[9j Theorem 3] Let f be a bandlimited function 
on § 2 such that ff 1 = 0 for £ L'. Then for (l, m) e II* we 
have 


2L'-1 2L'—1 

It = Z Z ( 50 ) 

p =0 q =0 

where the weights a p L 1 are defined in 

In other words, the Fourier coefficients f™ can be expressed 
as a dot-product between weighted sample values and the basis 
functions evaluated at the sampling points. In analogy with 
the Euclidean case, we now observe that the lowpass portion 
of the spectrum of / coincides with the lowpass portion of 
the spectrum of the generalized function obtained by placing 
weighted Diracs at grid points. Let / be bandlimited so that 
f™ = 0 for £ > L. Let further L < L'\ that is, the grid (48} 
oversamples /. Then the spectral coefficients can be expressed 
as the following inner product. 


It 


( Z xr V 

\p,q=0 / 


(51) 


for £ < L, \m\ ^ £. 

This is the key insight. Notice that the lowpass portion of 
the spectrum of g (for £ < IJ) can be written as 


n m 

9t 



z 

\(p,q)eS 


a p lpq^8 p ,lt> q 



(52) 


But f™ = 0 for £ ^ L, so the portion of the spectrum for 
L ^ £ < L' contains only the inhuence of the corrupted 
samples. 


z 

(p,q)€S 


fiL') 


SpqSg p 



L^£<L'. (53) 


Consequently, we can use this part of the spectrum to learn 
which samples are corrupted, and by how much. This is the 
subject of the following proposition. 


Proposition 2. Let f be a signal on the sphere of bandwidth L. 
Then we can perfectly reconstruct f from corrupted samples 
taken on the grid (48}. as long as the number of corruptions 
K satisfies 

K^L' -L- sjL’ - L + 1 + 1. (54) 


Proof As discussed in Section III we can use any m = const, 
line in the spectrum to get the rows of the annihilation matrix. 
However, we brst need to compute the corresponding columns 
of the data matrix. From Fig. [5] we see that the middle columns 
cannot be used for shot noise removal: we seek columns 
inbuenced only by corruptions. But the middle columns of 
the data matrix are obtained from the middle spectral columns 
(for to < L), so they are inbuenced both by the desired signal 
and the corruptions. This means that we can only use spectral 
bins for L < to < L', as illustrated in Fig. [5] For to = L and 
to = — L, the number of segments of length K + 1 that we 
can get is L' — L — K. For m = L + 1 and to = — (L + 1) 
it is L' — L — K — 1, and so on. Summing up we have that 
the total number of consecutive segments of length K + 1 we 
can use is 


# = 2(L' - L-K) + 2(L' -L-K- 1) + • • • + 2 ■ 1 
= (L' — L — K)(L' -L-K+ 1). 

We need this number to be at least K, because we need K 
rows in the annihilation matrix. We thus obtain the claim of 
the proposition by solving the inequality # 5 - K. □ 

After detecting the corrupted readings, we can use the 
estimated corruption values to estimate the function. Another 
option is to simply ignore them altogether, as we have more 
samples than the minimum number thanks to oversampling. A 
shot noise removal experiment is illustrated in Fig. [6] 


C. Sound Source Localization 

Spherical microphone arrays output a time-varying spherical 
signal. If the signal is induced by a collection of point sources, 
we can use the proposed spherical FRI sampling scheme to 
estimate the directions-of-arrival (DOAs) of the sources. For 
simplicity, we consider the narrowband case, i.e., the sources 
emit a single sinusoid. 
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Colatitude 0 Spherical harmonic degree 

Fig. 7. Multiple DOA estimation by a spherical microphone array. First row of subfigures corresponds to fa = 1000 Hz, and second row to fa = 4000 Hz. 
Sphere has a radius r = 0.2 m, and the source is located at [0, 0, 3] T m. The real and imaginary parts, and the absolute value of the Green’s function are 
shown in subfigures A and D. Real part, imaginary part and absolute value of the spectrum are shown in subfigures B and E. Subfigures C and F show the 
simulation results for K = 2 and K = 5, and random source placement. Blue diamonds represent the source locations, and thick red lines show the estimated 
directions. Size of the sphere is exaggerated for the purpose of illustration. The sphere color corresponds to the absolute value of the function induced on the 
sphere by the sources (microphones measure samples of this function). The bandwidth was set to L = 12 at 1000 Hz and to L = 30 at 4000 Hz. 


E Lowpass signal / □ Used for shot-noise 



Fig. 5. Spectrum structure in shot noise removal. Green-shaded bins get 
contribution from the desired signal / with bandwidth L; hatched bins are 
influenced by the shot noise; red-shaded columns are (i) long enough to 
annihilate shot noise and (ii) recoverable from the corrupted spectrum. 


How does this example fit into our sparse sampling frame¬ 
work? In spherical microphone arrays, the microphones are 
distributed on the surface of a sphere, either open or rigid (7;|. 
Therefore, the microphone signals represent samples of a time- 
varying function on § 2 . If a sound source emits a sinusoid, 
every microphone measures the amplitude and the phase of 
that sinusoid shaped by the characteristics of the propagating 
medium and of the spherical casing. Equivalently, for every 
microphone we get a complex number. 

Suppose that a source of unit intensity is located at s, and 



Fig. 6. Shot noise removal via spherical FRI, for L = 6, I! = 12 and 
K = 4 malfunctioning sensors. Corrupted signal is shown in subfigure A, 
together with the true corruption values (blue diamonds) and the estimated 
corruptions (red circles); same signal with the shot noise removed is shown 
in B, with the correct sample values at the corrupted locations denoted by 
blue diamonds. 


that the microphones are mounted on a rigid sphere of radius r 
with center at the origin. The response measured by the micro¬ 
phone at r, such that ||r|| = r, is given by the corresponding 
Green’s function. For a wavenumber n = 2ixv/c, where v is 
the frequency and c is the speed of sound, the Green’s function 

is 0 

•7 OO 

g{r\s,n) = j- b e (Kr)h < ' e L \ns)(2£+ l)P e (cosa rs ), (55) 
47r e=o 

where is the spherical Hankel function of the first kind 
and of order l. Pi is the Legendre polynomial, and cos a rs = 
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Fig. 8. Ratios of Green’s functions. We computed the Green’s function for 
nine different source distances (1.0 m, 1.2 m, 1.4 m, 1.6 m, 1.8 m, 2.0 m, 3.0 
m, 4.0 m, 5.0 m). Then we plotted the magnitude of the ratio of the Green’s 
function at each distance and the Green’s function at the largest distance (5 
m), both in space (B) and in the spectrum (C). The more parallel the ratio 
curve is with the abscissa axis, the more similar the Green’s function at that 
distance is to the Green’s function at 5 m. Curves are plotted in the order 
of increasing distance in the direction of the dashed arrow (up to down), as 
indicated in (A). 


^-(r,s). Mode strength bg(kr) is defined as 

be(nr) = je(Kr) -(56) 

h\ (nr) 

where jg is the spherical Bessel functiorfjof order £, and prime 
(•)' denotes the derivative with respect to the argument. 

The Green’s function g should be seen as a filter that 
describes how the point source’s influence spreads over the 
sphere. It is shown for two different frequencies in Figs. [7]4 
and [7p, while the corresponding spectra are given in Figs. 
|7j3 and [7J:. We see that the absolute pressure on the sphere 
has a similar shape for both frequencies, but the real and 
imaginary parts vary faster at higher frequencies, implying 
higher bandwidth. In both cases we observe that the Green’s 
function is approximately bandlimited. 

Assume now that there are K sound sources at locations 
with complex intensities {a k )Y v The resulting 
measurement by a microphone at point r is 

K 

/( r ) = Y ong{r\a k ,K). (57) 

fc=l 

If all the source locations ,s k are at the same distance 
from the sphere, then the Green’s function ( |55j ) depends 
only on the angle between r and s. For some fixed source 
distance d, we can define (issl(£) = f g( x £ r \ x rid, K )- where 
Xf denotes the unit vector corresponding to £, x v the unit 
vector corresponding to the north pole //, and the subscript SSL 
stands for sound source localization. Then ( |57) > corresponds to 
a weighted sum of K rotations of a known template function 
h-SSL, 

K 

/(£) = Xj ^^sslC^ 1 O £). (58) 

k= 1 


3 We use the standard symbol jg for the spherical Bessel function. Note the 
subtle difference from the imaginary unit j. 


As it is unrealistic to assume that the sources are all at the 
same distance, we hope that the shape of g(r\s, n) does not 
(strongly) depend on ||s||. Indeed, it turns out that the shape is 
approximately preserved within a certain range, as illustrated 
in Fig. [8] We therefore suppress the dependency of g on ||s|| 
and approximate ([57]) as follows. 


K 


/(0 = Y ot k g(t\s k ,n) 
k =1 
K 

« Y Sfc/issiXPfc 1 ° o 


(59) 


k = 1 
K 


Y ®kK£,\ts k ) 


k =1 


* ^ssl(C)- 


Here, we absorbed a k and additional (complex) scaling due 
to different distances into a k , and hssh is computed at some 
predefined mean distance. 

We thus reduced the sound source localization problem to 
a problem of finding the parameters of a weighted sum of 
Diracs. In order to apply our spherical FRI algorithm, we need 
to verify that g is bandlimited on the sphere. Figs.[7p and [7]i 
show that it is indeed approximately bandlimited, and that the 
bandwidth depends on the frequency (it also depends on the 
sphere radius). 

Figs. HP an d|Z P show an example of recovering two sources 
at 1000 Hz and 5 sources at 4000 Hz using the proposed 
spherical sparse sampling scheme. It is worth noting that 
this succeeds in spite of the model mismatch due to varying 
source distances. This indicates the robustness of the proposed 
reconstruction algorithm. 


V. Conclusion 

We presented a new sampling theorem for sparse signals 
on the sphere. In particular, by leveraging ideas from finite 
rate-of-innovation sampling, we showed how to reconstruct 
sparse collections of spikes on the sphere from their lowpass- 
filtered observations. Compared to existing sparse sampling 
schemes on the sphere, we use the available spectrum more 
efficiently by generalizing known results on 2D harmonic 
retrieval, thereby reducing the number of samples required to 
reconstruct the parameters of the spikes. 

We illustrated the usefulness of our algorithm by using it to 
solve three problems: sampling diffusion processes, shot noise 
removal, and sound source localization. But there is a wealth 
of other applications, for example in astronomy. Just think 
about the numerous spherical signal processing challenges put 
forward by the square kilometer array (SKA) project |35) . 

We mentioned some approaches to estimation from noisy 
samples, but more efficient denoising schemes should be 
studied. One example, effective in the Euclidean setting, is the 
Cadzow denoising algorithm |36|. The problem seems more 
challenging on the sphere; in particular, the annihilating matrix 
is block-Hankel, rather than Hankel. 














12 


Appendix 


We can compute the three required derivatives 


A. Annihilating Property 

For the sake of completeness, we show in this appendix 
that the annihilation filter annihilates linear combinations of 
exponentials. We compute the response of the filter H(z) in 
m to a signal of the form y n = XlfLi bk%k as 

K K / K \ 

(y *h) n = Z Vn-mhm = Z ( Z ) h ™ 

m =0 m =0 \k= 1 ) 

K K K K 

= Z Z h ™ x k m = Z sSfclK 1 _ xkx i^ 

k= 1 m =0 k= 1 i= 1 

= o. 


%^=Z Z Yr(Oo,MYr(o n ,K), 


da 0 
SMC) 

d6 n 


e=o |m|^r 

-V 2 

£=0 |m|^£ 


to cot 9 o Y ( m (0 o ,(j>o) 


+ VO - m){l +m+ l)e^°Y e m+ \d 0 , fa) 


Yr{e n An), 


SMC) 

d(j) 0 


L—l 


= «oZ Z H m ) Y M d o, MY e m (e n , 4> n ). 


1=0 |m|^£ 

NowVL=|^J^, j^, , and the Fisher information 


matrix is 


B. Computation of the Cramer-Rao Lower Bound 

A lowpassed collection of A' Diracs can be written as 
follows, 


L—l 


K 


m<t>)= Z Z Z otkYMSk^k) (60) 


^=0 m-=—t \k =1 


1 

1(0 =E[VA(C)VA(C) ff ] = ^ Z v /n(C)V/„(C) ff . 

n= 1 

Let £ be any unbiased estimator of the parameters The 
CRLB can then be computed as 

cov(C) > /(C)” 1 . (67) 


In the remainder of this section, we assume K = 1, so we 
rewrite the function as 

L—l l 

M<f>) = Z Z OoY t m (0o,<h)yr(S,<l>)- ( 61 ) 

£=0 m=—l 

We take samples on the sphere at the locations 
{(9n,4>n)}n=i- The nth sample is given by 

Bn = fiPni $n) A £n (62) 

where e n ~ A/"(0, cr 2 ), and they are iid. By C = [cto, 0o> </>o] T - 
we denote the vector of parameters we estimate. To make the 
dependence on f explicit, we rewrite ( |62) slightly as 

Bn /n(C) "f ^ n■ (63) 

With this notation in hand, we can write the conditional 
probability density function of the nth measurement as 

p(bIC) = J_ e~ [;j ~ / " (C)]2/(2CT2) , (64) 

V 27TCT z 

so that the log-likelihood function is 

L(C) = f lnp(fti,..., Bn\C) 

N 

= Z [-5 ln ( 27rfj2 ) - (Bn-/n(C)) 2 /( 2(j2 )] ■ (65) 


C. Rank of the annihilating matrix 

In this appendix, we show that the rank of the annihilating 
matrix Z ((35) is K with probability one, as soon as it has at 
least K rows. It then follows follows that the annihilating filter 
h is uniquely determined, up to a scaling factor, by solving 
Zh = 0. 

Consider the factorization A = XAU, 



where Xu = cos Ok and Uk,m = (sin 0 fc)l m le _jm< (' fc . 

To construct the annihilating matrix Z as in equation ( [35) , 
we create Hankel blocks from columns of A. The (L — K ) x 
(.K + 1) Hankel block corresponding to the middle (to = 0) 
column of A can be factored as 


Bn = 


L-K-l 


Xi 

1 


L-K-l 

l K 


X K 

1 


Oil 


a K 


„K 


r K 

l k 


= X 0 AS. 


„ o 
X K 

( 68 ) 


Consequently, differentiating L with respect to any entry f, of 
£ gives 


8L 1 " df n (C) 
S(i ^ 5 Ci 


( 66 ) 


The second block of the annihilating matrix obtained from the 
column corresponding to to = —1 is similar, 

B-i = XiTLxAE, (69) 
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def 

where Y _i = diag(ui i _i,..., uk-i), and X rn is obtained 
by removing the m leading rows from X (> . Then we can write 


z 

= [ 

s 0 T , 

Blx, 

Bj, ..., 

BJc-l+ i, B~[_ k _ i] t 






I 

■ AS ~ 







■ Y_x 

■ AS 




= 



■ Yx 

■ AS 


(70) 



X L . 

-K- 1 

■ Yk-l+x 

■ AS 





_X L 

-K- 1 

■ Y l _ k _x 

■ AS 



with 

the 

AS 

factor 

being common for all row-blocks 

We 


want to show that the nullspace of Z has dimension one. To 
that end, we just need to establish that the following matrix. 


*o ' 1 

X 1 ■ Y__ i 

• Y 1 

Xl-k-i ■ Yk-l+i 
X-L—K—l ' Y l _ k _i 


(71) 


has full column rank. To see why this is the case, let v be 
a non-zero vector such that 0 = Zv = T(ASv). It then 
follows from the full-rankness of T that ASv = 0. Since A 
is a diagonal matrix with non-zero entries on the diagonal and 
E is a K x (A'+l) Vandermonde matrix with distinct roots, the 
vector v is uniquely determined up to a multiplicative factor. 
We now show that the matrix T indeed has full column rank 
almost surely. 

Any column in T J is of the form 


(cos(9i) r (sin<2i)l s le^ lS 

(cos 0 K-) r (sin 0 K -) |s| e j ^ s 


(72) 


where 0 < r < L— K— |s| and — (L—K— 1) < s L—K—l. 
If the locations of the Diracs are random, we can use the 
following lemma to show that the matrix T will have full 
column rank with probability one. 

Lemma 3. Draw [£& = (9k,4>k)\k =i independently at random 
from any absolutely continuous probability distribution on 
TZ = [0,7r] x [0, 27t] (w.r.t. Lebesgue measure). Let A4 = 
{(ri, si),..., (rjv, sjv)} be a set of distinct integer pairs and 
let G = [g pq \, where g pq = (cos0 p ) r< 1 (sin0p)l s «le j,i!,s ‘!. Then 
G has full rank almost surely. 


Proof. This proof is parallel to that of Theorem 3.2 from |28| . 
Let Gm be the upper left M x M minor of G. We define the 
bad set Bm as the set on which Gm is singular, 

B m = {( 6 , • • • ,Cm) e | det G m = 0 } . (73) 


measure zero. Let (Ci,... , Cm) £ Bm, Gm is invertible. 
Because it is invertible, there exists a unique coefficient vector 
b = b(G ,... ,Cm) such that 


Gj\jb = 3 m+ 1 ) (74) 

where by gM+i we denote the first M entries of the last 
column of Gm+x- The bigger matrix Gm+i will be singular 
if and only if the same linear combination is also consistent 
with its (M + l)st row. In other words, Gm+x is invertible if 
and only if Cm+i is not in the set 

£m(Ci,---,Cm) = | (0,0) 

(cos 0) rM+1 (sin 0 )^ SM+ 1 le j ^ SM+1 
M 

= Y i bi{ cos 0 ) ri (sin 0 ) |ai| e i ^ i 
2=1 

For fixed (Ci,..., Cm), this is the set of zeros of a particular 
(generalized) trigonometric polynomial, thus it has measure 
zero. Note that the definition of Zm makes sense only for 
(Ci,..., Cm) ^ Bm, as otherwise Gm is not invertible. Thus, 
the solution b to ( f74| may not exist. 

Consider now the following two sets: 

Hpf 

Um+i = {(Ci, • ■ ■ ,£m+i) | (Cl, ■ ■ •,C m) e Bm,£m +i e IZ} 

and 

Vm+i == {(Ci, • ■ • ,Cm+i) | (Ci, • • ■ ,Cm) e A m ,Cm+i e Z M } ■ 

The bad set Bm +i must be a subset of the set U uV. But we 
just showed that the set V has measure zero; by the induction 
hypothesis, U also has measure zero. Thus their union, too, 
has measure zero. 

It follows that Bm +i has measure zero. Finally, because the 
distributions of Ci are absolutely continuous w.r.t. the Lebesgue 
measure, so is their product distribution. Hence the probability 
that (Ci, • • •,C k) lies in the zero-measure set Bk is zero. □ 

To complete the argument, note that the matrix T J has the 
same form as the matrix G in the statement of Lemma [3] with 
0^r<L-K~\s\ and — (L - K-l)^s^L-K-l. 
Thus, the columns of T are independent with probability one, 
provided that its number of rows is at least K. 
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